-
- Guide to Porting lsof 3 to Unix OS Dialects
-
- **********************************************************************
- | The latest release of lsof is always available via anonymous ftp |
- | from vic.cc.purdue.edu. Look in pub/lsof.README for its location. |
- **********************************************************************
-
- Contents
-
- General Guidelines
- Organization
- Source File Naming Conventions
- Coding Philosophies
- Data Requirements
- Dlsof.h and #include's
- Definitions That Affect Compilation
- Options: Common and Special
- Defining Dialect-Specific Symbols and Global Storage
- Coding Dialect-specific Functions
- Function Prototype Definitions and the _PROTOTYPE Macro
- The Makefile
- The Mksrc Shell Script
-
-
- General Guidelines
- ------------------
-
- These are the general guidelines for porting lsof 3 to a new Unix
- dialect:
-
- * Understand the organization of the lsof sources and the
- philosophies that guide their coding.
-
- * Understand the data requirements and determine the methods
- of locating the necessary data in the new dialect's kernel.
-
- * Pick a name for the subdirectory in lsof3/dialects for your
- dialect.
-
- * Locate the necessary header files and #include them in the
- dialect's dlsof.h file. (You may not be able to complete
- this step until you have coded all dialect-specific functions.)
-
- * Determine the optional common functions of lsof to be used
- and set their definitions in the dialect's machine.h file.
-
- * Define the dialect's specific symbols and global storage
- in the dialect's dlsof.h and dstore.c files.
-
- * Code the dialect-specific functions in the appropriate
- source files of the dialect's subdirectory. Select appropriate
- common code fragments from lsof3/dialects/common.
-
- Include the necessary prototype definitions of the dialect-
- specific functions in the dproto.h file in the dialect's
- subdirectory.
-
- * Define the dialect's Makefile and source construction shell
- script, Mksrc. Include in Mksrc the steps necessary to
- construct source files from common code fragments.
-
- Organization
- ------------
-
- The code in a dialect-specific version of lsof comes from three
- sources:
-
- 1) functions common to all versions, located in the top level
- directory, lsof3;
-
- 2) functions specific to the dialect, located in the dialect's
- subdirectory -- e.g., lsof3/dialects/sun;
-
- 3) functions that are common to several dialects, although
- not to all, contained in code fragment files, located in
- a common subdirectory -- e.g., lsof3/dialects/common/rdev.frag.
-
- The tree looks like this:
-
-     lsof3/
-       |
-       +-- 1) fully common functions -- e.g., lsof3/main.c
-       |
-       +-- dialects/
-             |
-             +-- 2) dialect-specific subdirectories -- e.g.,
-             |      lsof3/dialects/sun
-             |
-             +-- 3) common code fragments -- e.g.,
-                    lsof3/dialects/common/rdev.frag
-
- The code for a dialect-specific version is assembled from these
- three sources by the Configure shell script in the top level
- directory. It calls on the Mksrc shell script in each dialect's
- subdirectory to assemble the dialect-specific sources. That assembly
- can be simply creating a symbolic link from the top level to the
- dialect's subdirectory, or it can be a process of combining a
- dialect-specific source file with code fragment files from the
- lsof3/dialects/common subdirectory.
-
- The Configure script completes the dialect's Makefile by adding
- string definitions to it while copying it from the dialect's
- subdirectory to the top level.
-
-
- Source File Naming Conventions
- ------------------------------
-
- With one exception, dialect-specific source files begin with a
- lower case `d' character -- ddev.c, dfile.c, dlsof.h. The one
- exception is the header file that contains dialect-specific
- definitions for the optional features of the common functions.
- It's called machine.h for historical reasons.
-
- Currently all dialects use almost the same source file names. One
- exception to the rule happens in dialects where there must be
- different source files -- e.g., dnode[123].c -- to eliminate node
- header file structure element name conflicts. The source modules
- in the dcosx and novell subdirectories are organized this way.
-
- These are common files in lsof3/:
-
-     Configure   the configuration script
-     version     the version number
-     dialects/   the dialects subdirectory
-
- These are the common function source files in lsof3/:
-
-     arg.c       common argument processing functions
-     lsof.h      common header file that #include's the dialect-specific
-                 header files
-     main.c      common main function for lsof 3
-     misc.c      common miscellaneous functions -- e.g., special versions
-                 of stat() and readlink()
-     node.c      common node reading functions -- readinode(), readvnode()
-     print.c     common print support functions
-     proc.c      common process and file structure functions
-     proto.h     common prototype definitions, including the definition of
-                 the _PROTOTYPE() macro
-     store.c     common global storage
-     version.h   the current lsof version number, derived from the file
-                 version by the Makefile
-
- These are the dialect-specific source files:
-
-     ddev.c      device support functions -- readdev()
-     dfile.c     file processing functions -- commonly this file uses
-                 code fragments from lsof3/dialects/common
-     dlsof.h     dialect-specific header file -- contains #include's
-                 for system header files and dialect-specific global
-                 storage declarations
-     dmnt.c      mount support functions -- commonly this file uses code
-                 fragments from lsof3/dialects/common
-     dnode.c     node processing functions -- e.g., for gnode or vnode
-     dnode?.c    additional node processing functions, used when node
-                 header files have duplicate and conflicting element
-                 names.
-     dproc.c     functions to access, read, examine and cache data about
-                 dialect-specific process structures -- this file contains
-                 the dialect-specific "main" function, gather_proc_info()
-     dproto.h    dialect-specific prototype declarations
-     dsock.c     dialect-specific socket processing functions
-     dstore.c    dialect-specific global storage -- e.g., the nlist()
-                 structure
-     machine.h   dialect-specific definitions of common function options --
-                 e.g., a HASINODE definition to activate the readinode()
-                 function in lsof3/node.c
-
- These are the common code fragments. The file common/00Manifest
- describes the fragments and the dialects that use them. Note that
- some fragments may be configured with #define statements. Consult
- the comments at the beginning of the fragments and check the methods
- that existing dialects use to configure them.
-
-     ckfa.frag   ck_file_arg() function
-     cvfs.frag   completevfs() function
-     dvch.frag   read_dcache() and write_dcache() functions for
-                 handling the device cache file (these functions
-                 may be configured slightly)
-
-                 +=================================================+
-                 | IF YOUR DIALECT MUST BE SETUID(ROOT), MAKE SURE |
-                 | YOU DEFINE DVCH_CHOWN OR DVCH_FCHOWN.           |
-                 +=================================================+
-
-     isfn.frag   is_file_named() function
-     pcdn.frag   printchdevname() function
-     prfp.frag   process_file() function
-     prtf.frag   print_file() function
-     rdev.frag   contains readdev() and stkdir() functions that may
-                 be configured slightly
-     rmnt.frag   readmnt() function
-     rnam.frag   BSD-style name cache functions, ncache_*() that may
-                 be configured slightly
-     rnch.frag   SYSV-style name cache functions, ncache_*() that may
-                 be configured slightly
-     rvfs.frag   readvfs() function
-
-
- Coding Philosophies
- -------------------
-
- A few basic philosophies govern the coding of lsof 3 functions:
-
- * Use as few #if/#else/#endif constructs as possible, even at
- the cost of nearly-duplicate code.
-
- When #if/#else/#endif constructs are necessary:
-
- o Use the form
-
- #if defined(<symbol>)
-
- in preference to
-
- #ifdef <symbol>
-
- to allow easier addition of tests to the #if.
-
- o Indent them to signify their level -- e.g.,
-
- #if /* level one */
- # if /* level two */
- # endif /* level two */
- #else /* level one */
- #endif /* level one */
-
- o Use ANSI standard comments on #else and #endif statements.
-
- * Document copiously.
-
- * Aim for ANSI-C compatibility:
-
- o Use function prototypes for all functions, hiding them
- from compilers that cannot handle them with the _PROTOTYPE()
- macro.
-
- o Use the compiler's ANSI conformance checking wherever
- possible -- e.g., gcc's -ansi option.
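
The conditional compilation conventions above might look like this in
practice. This is only a sketch: the symbols HASFOO and HASBAR, the
FEATURE_LEVEL definition and the feature_level() function are hypothetical
names invented for illustration and do not appear in the lsof sources.

```c
#define HASFOO  1       /* hypothetical option symbol, enabled */
                        /* HASBAR (also hypothetical) is left undefined */

/*
 * The defined() form lets a later test be added to the same line --
 * e.g., #if defined(HASFOO) && !defined(HASBAR)
 */

#if     defined(HASFOO)
# if    defined(HASBAR)
#define FEATURE_LEVEL   2
# else  /* !defined(HASBAR) */
#define FEATURE_LEVEL   1
# endif /* defined(HASBAR) */
#else   /* !defined(HASFOO) */
#define FEATURE_LEVEL   0
#endif  /* defined(HASFOO) */

/*
 * feature_level() - report the branch the preprocessor selected
 */

int
feature_level(void)
{
        return (FEATURE_LEVEL);
}
```

The nested directives carry one extra space after the `#' per level, and
each #else and #endif repeats its controlling condition in a comment.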
-
-
- Data Requirements
- -----------------
-
- Lsof's strategy in obtaining open file information is to access
- the process table via its proc structures, then obtain the associated
- user area and open file structures. The open file structures then
- lead lsof to file type specific structures -- cdrnodes, fifonodes,
- inodes, gnodes, hsfsnodes, pipenodes, pcnodes, rnodes, snodes,
- sockets, tmpnodes, and vnodes.
-
- This means that to begin an lsof port to a new Unix dialect you
- must understand how to obtain these structures from the dialect's
- kernel. Look for kernel access functions -- e.g., the AIX readx()
- function, Sun and Sun-like kvm_*() functions, or SGI's syssgi()
- function. Look for clues in header files -- e.g., external declarations
- and macros.
-
- If you have access to them, look at sources to programs like ps(1),
- or the freely available monitor and top programs. They may give
- you important clues on reading proc and user area structures. An
- appeal to readers of dialect-specific news groups may uncover
- correspondents who can help.
-
- Careful reading of system header files -- e.g., <sys/proc.h> --
- may give hints about how kernel storage is organized. Look for
- global variables declared under a KERNEL or _KERNEL #if. Run nm(1)
- across the kernel image (/vmunix, /unix, etc.) and look for references
- to structures of interest.
-
- Even if there are support functions for reading structures, like the
- kvm_*() functions, you must still understand how to read data from
- kernel memory. Typically this requires an understanding of the
- nlist() function, and how to use /dev/kmem, /dev/mem, and /dev/swap.
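
Reading kernel memory usually comes down to a seek followed by a read on
a memory file. The sketch below assumes a descriptor that is already open
on a file such as /dev/kmem and an address that was obtained elsewhere
(for example, from an nlist() lookup). The name kread() and its argument
list are assumptions made for this illustration; some dialects may need a
different seek call or address type.

```c
#include <sys/types.h>
#include <unistd.h>

/*
 * kread() - read len bytes at offset addr from an open memory file
 *
 * fd is assumed to be open on a memory file such as /dev/kmem.
 * Returns 0 on success and 1 on failure, including a short read.
 */

int
kread(int fd, off_t addr, char *buf, size_t len)
{
        ssize_t br;

        if (lseek(fd, addr, SEEK_SET) == (off_t)-1)
                return (1);
        br = read(fd, buf, len);
        return ((br == (ssize_t)len) ? 0 : 1);
}
```

A caller would typically pass the address of a kernel structure and a
buffer of the structure's size, then check the return value before using
the buffer's contents.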
-
- Don't overlook the possibility that you may have to use the process
- file system -- e.g., /proc. Look at the Motorola V/88 R40V4.2 and
- Novell UnixWare dialects for examples. I try to avoid using /proc
- when I can, since it usually requires that lsof have setuid(root)
- permission to read the individual /proc "files".
-
- Once you can access kernel structures, you must understand how
- they're connected. You must answer questions like:
-
- * How are the proc structures organized? Is it a static
- table? Are the proc structures linked? Is there a
- kernel pointer to the first proc structure? Is there a
- proc structure count?
-
- * If this is a Mach derivative, is it necessary to obtain the
- task and thread structures? How?
-
- * How does one obtain the user area (or the utask area in Mach
- systems) that corresponds to a process?
-
- * Where are the file structures located for open file
- descriptors and how are they located? Are all file
- structures in the user area? Is the file structure space
- extensible?
-
- * Where do the private data pointers in file structures lead?
- To gnodes? To inodes? To sockets? To vnodes? Hint: look
- in <sys/file.h> for DTYPE_* instances and further pointers.
-
- * How are the nodes organized? To what other nodes do they
- lead and how? Where are the common bits of information in
- nodes -- device, node number, size -- stored? Hint: look
- in the header files for nodes for macros that may be used
- to obtain the address of one node from another -- e.g., the
- VTOI() macro that leads from a vnode to an inode.
-
- * Are text reference nodes identified and how? Is it
- necessary to examine the virtual memory map of a process or
- a task to locate text references? Some kernels have text
- node pointers in the proc structures; some, in the user
- area; Mach kernels may have text information in the task
- structure, reached in various ways from the proc, user area,
- or user task structure.
-
- * How is the device table -- e.g., /dev or /devices --
- organized? How is it read? Using direct or dirent structures?
-
- How are major/minor device numbers represented? How are
- device numbers assembled and disassembled?
-
- Are there clone devices? How are they identified?
-
- * How is mount information obtained? Getmntinfo()? Getmntent()?
- Some special kernel call?
-
- * How are sockets identified and organized? BSD-style? As
- streams? Are there streams?
-
- * Are there special nodes -- CD-ROM nodes, FIFO nodes, etc.?
-
- * How is the kernel's name cache organized? Can lsof access
- it to get partial name components?
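
As one illustration of the first question, the sketch below follows a
chain of linked proc structures. Everything in it is hypothetical: struct
xproc is a two-member stand-in for a real proc structure, kcopy() copies
from an in-memory image instead of reading the kernel, and addresses are
plain offsets into that image. A real port would take the structure from
<sys/proc.h> and read it through a kernel access function.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified proc structure */
struct xproc {
        long p_next;    /* "address" of the next proc, 0 at the end */
        int  p_pid;     /* process ID */
};

/*
 * kcopy() - stand-in for a kernel read: copy len bytes at addr from a
 *           memory image into buf; return 0 on success, 1 on failure
 */

static int
kcopy(const char *img, size_t imgsz, long addr, void *buf, size_t len)
{
        if (addr < 0 || (size_t)addr > imgsz || len > imgsz - (size_t)addr)
                return (1);
        memcpy(buf, img + addr, len);
        return (0);
}

/*
 * walk_procs() - follow the proc chain that starts at head, storing up
 *                to max PIDs in pids[]; return the number stored, or -1
 *                on a read error
 */

int
walk_procs(const char *img, size_t imgsz, long head, int *pids, int max)
{
        struct xproc p;
        long a;
        int n = 0;

        for (a = head; a != 0 && n < max; a = p.p_next) {
                if (kcopy(img, imgsz, a, &p, sizeof(p)))
                        return (-1);
                pids[n++] = p.p_pid;
        }
        return (n);
}
```

The max limit also keeps the walk from running forever if the chain
changes underneath the reader and happens to loop.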
-
-
- Dlsof.h and #include's
- ----------------------
-
- Once you have identified the kernel's data organization and know
- what structures it provides, you must add #include's to dlsof.h to
- access their definitions. Sometimes it is difficult to locate the
- header files -- you may need to introduce -I specifications in the
- Makefile via the DINC shell variable in the Configure script.
-
- Sometimes it is necessary to define special symbols -- e.g., KERNEL,
- _KERNEL, _KMEMUSER -- to induce system header files to yield kernel
- structure definitions. Sometimes making those symbol definitions
- causes other header file and definition conflicts. There's no good
- general rule on how to proceed when conflicts occur.
-
- Rarely it may be necessary to extract structure definitions from
- system header files and move them to dlsof.h, create special versions
- of system header files, or obtain special copies of system header
- files from "friendly" (e.g., vendor) sources. The dlsof.h header
- file in lsof3/dialects/sun shows examples of the first case; the
- dec_a subdirectory in lsof3/dialects/osf, the second; the irix5hdr
- subdirectory in lsof3/dialects/sgi, a mixture of the first and
- third.
-
- Building up the necessary #include's in dlsof.h is an iterative
- process that requires attention as you build the dialect-specific
- functions that reference kernel structures. Be prepared to revisit
- dlsof.h frequently.
-
-
- Definitions That Affect Compilation
- -----------------------------------
-
- The source files at the top level contain optional functions that
- may be activated with definitions in a dialect's machine.h header
- file. Mostly these are functions for reading node structures that
- may not apply to all dialects -- e.g., CD-ROM nodes (cdrnode), or
- `G' nodes (gnode). Once you understand your kernel's data
- organization, you'll be able to decide the optional common node
- functions to activate.
-
- Definitions in machine.h and dlsof.h also enable or disable other
- optional common features:
-
-     HASDCACHE       enables the use of the device file cache. It
-                     contains information about the names, device
-                     numbers and inode numbers of entries in the
-                     /dev node subtree that lsof saves from call
-                     to call.
-     HASCDRNODE      enables/disables readcdrnode() in node.c
-     HASFIFONODE     enables/disables readfifonode() in node.c
-     HASFSTYPE       enables/disables the use of the file system
-                     type as reported in some stat(2) structures.
-     HASGNODE        enables/disables readgnode() in node.c
-     HASHSNODE       enables/disables readhsnode() in node.c
-     HASINODE        enables/disables readinode() in node.c
-     HASINTSIGNAL    is defined when signal() returns an int
-     HASKOPT         enables/disables the ability to read the
-                     kernel's name list from a file -- e.g., from
-                     a crash dump file.
-     HASMOPT         enables/disables the ability to read kernel
-                     memory from a file -- e.g., from a crash
-                     dump file.
-     HASNCACHE       enables the probing of the kernel's name cache
-                     to obtain path name components.
-     HASNLIST        enables/disables nlist() function support.
-                     (See NLIST_TYPE.)
-     HASPINFO        defines the name (if any) of the process
-                     information subdirectory of the process file
-                     system -- e.g., /proc/pinfo. When HASPINFO is
-                     defined, HASPROCFS should also be defined.
-     HASPIPENODE     enables/disables readpipenode() in node.c
-     HASPROCFS       defines the name (if any) of the process file
-                     system -- e.g., /proc. When HASPROCFS is
-                     defined, it may also be necessary to define
-                     HASPINFO.
-     HASPWSTAYOPEN   enables/disables support of a method to keep
-                     /etc/passwd open during a sequence of lookup
-                     operations.
-     HASRNODE        enables/disables readrnode() in node.c
-     HASSECURITY     enables/disables restricting open file
-                     information access.
-     HASSTREAMS      enables/disables streams.
-     HASTMPNODE      enables/disables readtnode() in node.c
-     HASVNODE        enables/disables readvnode() function in node.c
-     HASXOPT         defines help text for dialect-specific X option
-                     and enables X option processing in arg.c and main.c
-     HASXOPT_VALUE   defines the default binary value for the X option
-                     in store.c
-     MACH            defines a MACH system.
-     NLIST_TYPE      is the type of the nlist table, Nl[], if it is
-                     not nlist. HASNLIST must be set for this
-                     definition to be effective.
-     UID_ARG_T       defines the cast on a User ID when passed as
-                     a function argument.
-     WARNDEVACCESS   enables the issuing of a warning message when
-                     lsof is unable to access /dev (or /device) or
-                     one of its subdirectories. Some dialects (e.g.,
-                     HP-UX) have many inaccessible subdirectories and
-                     it is easier to inhibit the warning for them.
-                     The -w option will also inhibit these warnings.
-     zeromem()       defines a macro to zero memory -- e.g., using
-                     bzero() or memset().
-
- Any dialect's machine.h can serve as a template for building your
- own. Most machine.h files list all the definitions, disabling some
- (by enclosing them in comments) and enabling others.
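
A fragment of a machine.h laid out this way might look like the sketch
below. The selection of symbols is arbitrary, and the values given to
them (1 for the enabling definitions, the memset() form of zeromem()) are
assumptions made for this illustration; check an existing dialect's
machine.h for the forms actually in use.

```c
#include <string.h>

/* #define HASCDRNODE   1 */    /* disabled: no CD-ROM nodes */
/* #define HASGNODE     1 */    /* disabled: no gnodes */

#define HASINODE        1       /* enabled: activates readinode() */
#define HASVNODE        1       /* enabled: activates readvnode() */

#define HASPROCFS       "/proc" /* name of the process file system */

/* zeromem() expressed with memset(); bzero() is the other usual choice */
#define zeromem(a, l)   memset((a), 0, (l))
```

Common code can then test a symbol with #if defined(HASINODE) and call
zeromem() without knowing which library function lies behind it.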
-
-
- Options: Common and Special
- ---------------------------
-
- All lsof options but one are common; the specific option is ``-X''.
- If a dialect does not support a common option, the related #define
- in machine.h -- e.g., HASCOPT -- should be deselected.
-
- The specific option, ``-X'', may be used by any dialect for its
- own purpose. Right now (May 30, 1995) the ``-X'' option is binary
- (i.e., it's not allowed arguments of its own, and its value must
- be 0 or 1) but that could be changed should the need arise. The
- option is enabled with the HASXOPT definition in machine.h; its
- default value is defined by HASXOPT_VALUE.
-
- The value of HASXOPT should be the text displayed for ``-X'' by
- the usage() function in arg.c. HASXOPT_VALUE should be the default
- value, 0 or 1.
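
The relationship between the two definitions can be sketched as follows.
The help text, the Xstate variable and the xopt_help() and toggle_xopt()
functions are hypothetical; they show one way a binary option could be
handled and are not copied from arg.c, main.c or store.c.

```c
#define HASXOPT         "use the alternate read method" /* hypothetical */
#define HASXOPT_VALUE   0                               /* default: off */

static int Xstate = HASXOPT_VALUE;      /* current state of -X */

/*
 * xopt_help() - return the text a usage message would show for -X
 */

const char *
xopt_help(void)
{
        return (HASXOPT);
}

/*
 * toggle_xopt() - flip the binary option, as if -X had been seen on
 *                 the command line; return the new state
 */

int
toggle_xopt(void)
{
        Xstate = Xstate ? 0 : 1;
        return (Xstate);
}
```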
-
- AIX for the IBM RISC System/6000 defines the ``-X'' option to
- control readx() usage, since there is a bug in AIX 3.2.x kernels
- that readx() can expose for other processes.
-
-
- Defining Dialect-Specific Symbols and Global Storage
- ----------------------------------------------------
-
- A dialect's dlsof.h and dstore.c files contain dialect-specific
- symbol and global storage definitions. There are symbol definitions,
- for example, for function and data casts, and for file paths.
- Consult any dialect's dlsof.h file and convert its "Miscellaneous
- definitions" section to your dialect. Dlsof.h defines index symbols
- for the nlist() table -- X_* symbols -- when it's being used.
-
- Global storage definitions include such things as structures for
- local Virtual File System (vfs) information; mount information;
- search file information; and kernel memory file descriptors --
- e.g., Kmem for /dev/kmem, Mem for /dev/mem, Swap for /dev/drum.
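
A sketch of how the index symbols and the global storage fit together is
shown below. It makes several assumptions: struct nl_entry is a stand-in
for the system's nlist structure so that the example compiles without
<nlist.h>, and the kernel symbol names "_proc" and "_nproc" are only
plausible examples. The names Kmem, Mem, Swap and Nl come from this
guide; their exact types in a real dialect may differ.

```c
#include <stddef.h>

/* dlsof.h side: index symbols for the name list table */
#define X_PROC          0       /* index of the proc table entry */
#define X_NPROC         1       /* index of the proc count entry */

/* Stand-in for struct nlist (an assumption of this sketch) */
struct nl_entry {
        const char      *n_name;        /* kernel symbol name */
        unsigned long    n_value;       /* address, set by the lookup */
};

/* dstore.c side: global storage */
int Kmem = -1;                  /* file descriptor for /dev/kmem */
int Mem = -1;                   /* file descriptor for /dev/mem */
int Swap = -1;                  /* file descriptor for /dev/drum */

struct nl_entry Nl[] = {
        { "_proc",      0 },    /* X_PROC */
        { "_nproc",     0 },    /* X_NPROC */
        { NULL,         0 }     /* end of table */
};
```

Code that needs the proc table address can then refer to
Nl[X_PROC].n_value without depending on the order of the table entries
anywhere else.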
-
-
- Coding Dialect-specific Functions
- ---------------------------------
-
- Each supported dialect must have some basic functions that the
- common functions of the top level may call. Some of them may be
- obtained from the code fragment files in lsof3/dialects/common.
- Others may have to be coded specifically for the dialect.
-
- Each supported dialect usually has private functions, too. Those
- are wholly determined by the needs of the dialect's data organization
- and access.
-
- These are the basic functions that each dialect must supply:
-
-     ck_file_arg()             function to check optional file name
-                               arguments
-     initialize()              function to initialize the dialect
-     is_file_named()           function to check if a file was named
-                               by an optional file name argument
-     gather_proc_info()        function to gather process table
-                               and related information and cache it
-     printchdevname()          function to locate and optionally
-                               print the name of a character device
-     print_file()              function to print open file information
-     process_file()            function to process an open file
-                               structure
-     process_node()            function to process a primary node
-     process_socket()          function to process a socket
-     readdev() and stkdir()    functions to read and cache device
-                               information
-     readmnt()                 function to read mount table information
-
- Check the code fragments in lsof3/dialects/common and specific
- lsof3/dialects/* files for examples.
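
To show how these functions relate, the sketch below models the decision
a process_file() function has to make: examine the type of an open file
structure and hand its private data pointer to node or socket processing.
The structure, the XDTYPE_* values, the handler enumeration and
choose_handler() are all hypothetical stand-ins; a real dialect takes the
file structure and the DTYPE_* symbols from <sys/file.h>.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the DTYPE_* symbols of <sys/file.h> */
#define XDTYPE_VNODE    1
#define XDTYPE_SOCKET   2

/* Hypothetical, simplified open file structure */
struct xfile {
        int      f_type;        /* descriptor type */
        void    *f_data;        /* private data: node or socket */
};

enum handler { H_NONE, H_NODE, H_SOCKET };

/*
 * choose_handler() - decide which processing function applies to an
 *                    open file structure
 */

enum handler
choose_handler(const struct xfile *f)
{
        if (f == NULL || f->f_data == NULL)
                return (H_NONE);
        switch (f->f_type) {
        case XDTYPE_VNODE:
                return (H_NODE);        /* process_node() would run */
        case XDTYPE_SOCKET:
                return (H_SOCKET);      /* process_socket() would run */
        default:
                return (H_NONE);        /* unknown type: nothing to do */
        }
}
```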
-
- As you build these functions you will probably have to add #include's
- to dlsof.h.
-
-
- Function Prototype Definitions and the _PROTOTYPE Macro
- -------------------------------------------------------
-
- Once you've defined your dialect-specific functions, you should
- define their prototypes in dproto.h or locally in the file where
- they occur and are used. Do this even if your compiler is not ANSI
- compliant -- the _PROTOTYPE macro knows how to cope with that and
- will avoid creating prototypes that will confuse your compiler.
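
A macro of this kind is commonly written along the lines below. Treat it
as a sketch of the technique and not as the text of proto.h: the test on
__STDC__ is an assumption about how ANSI compilers are detected, and
add_counts() is a hypothetical function used only to show a declaration.

```c
#if     defined(__STDC__)
#define _PROTOTYPE(function, params)    function params
#else   /* !defined(__STDC__) */
#define _PROTOTYPE(function, params)    function()
#endif  /* defined(__STDC__) */

/*
 * With an ANSI compiler this line expands to
 *      extern int add_counts (int a, int b);
 * and with an older compiler to
 *      extern int add_counts();
 */

_PROTOTYPE(extern int add_counts,(int a, int b));

int
add_counts(int a, int b)
{
        return (a + b);
}
```

The parameter list is passed as a single parenthesized macro argument,
which is why the comma sits between the function name and the opening
parenthesis.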
-
-
- The Makefile
- ------------
-
- Use an existing dialect's Makefile as a template. Be alert for
- special facilities to generate header file dependencies (see the
- sgi Makefile). Make sure that installation locations are appropriate.
- Change the GRP string accordingly. Make sure that the install
- program options are correct. Use the DEBUG string to set debugging
- options, like ``-g''. You may also need to use the -O option when
- forking and SIGCHLD signals defeat your debugger as they do under
- Motorola V/88.
-
- Finally, remember that strings can be passed from the top level's
- Configure shell script. That's an appropriate way to handle options,
- especially if there are multiple versions of the Unix dialect to
- which you are porting lsof 3.
-
-
- The Mksrc Shell Script
- ----------------------
-
- Pattern your Mksrc shell script after an existing one from another
- dialect. Change the D shell variable to the name of your dialect's
- subdirectory in lsof3/dialects. Adjust any other shell variable
- to your local conditions. (Probably that won't be necessary.)
-
- Finally, add sections to assemble your modules that use fragments
- from ../common.
-
- Note that, if using symbolic links from the top level to your
- dialect subdirectory is impossible or impractical, you can set the
- MKC shell variable in Configure to something other than "ln -s" --
- e.g., "cp" -- and Configure will pass it to the Mksrc shell script
- in the M environment variable.
-
-
- Vic Abell <abe@cc.purdue.edu>
- Purdue University Computing Center
- July 20, 1995
-